Semi-Parametric Batched Global Multi-Armed Bandits with Covariates
The multi-armed bandits (MAB) framework is a widely used approach for sequential decision-making, where a decision-maker selects an arm in each round with the goal of maximizing long-term rewards. Moreover, in many practical applications, such as personalized medicine and recommendation systems, feedback is provided in batches, contextual information is available at the time of decision-making, and rewards from different arms are related rather than independent. We propose a novel semi-parametric framework for batched bandits with covariates and a shared parameter across arms, leveraging the single-index regression (SIR) model to capture relationships between arm rewards while balancing interpretability and flexibility. Our algorithm, Batched single-Index Dynamic binning and Successive arm elimination (BIDS), employs a batched successive arm elimination strategy with a dynamic binning mechanism guided by the single-index direction. We consider two settings: one where a pilot direction is available and another where the direction is estimated from data, deriving theoretical regret bounds for both cases. When a pilot direction is available with sufficient accuracy, our approach achieves minimax-optimal rates (with $d = 1$) for nonparametric batched bandits, circumventing the curse of dimensionality. Extensive experiments on simulated and real-world datasets demonstrate the effectiveness of our algorithm compared to the nonparametric batched bandit method introduced by \cite{jiang2024batched}.
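The batched successive-elimination backbone of such algorithms is easy to sketch. The toy version below (function names and the Hoeffding-style confidence radius are my own choices; it omits BIDS's covariate binning and single-index direction entirely) pulls every surviving arm equally in each batch and, at each batch boundary, discards any arm whose upper confidence bound falls below the best arm's lower confidence bound:

```python
import math

def batched_successive_elimination(pull, n_arms, n_batches, pulls_per_arm, delta=0.05):
    """Toy batched successive arm elimination for rewards in [0, 1]."""
    active = list(range(n_arms))
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    for _ in range(n_batches):
        for a in active:                      # pull each surviving arm equally
            for _ in range(pulls_per_arm):
                sums[a] += pull(a)
                counts[a] += 1
        # Hoeffding-style confidence radius (union bound over arms and batches)
        rad = {a: math.sqrt(math.log(2 * n_arms * n_batches / delta) / (2 * counts[a]))
               for a in active}
        best_lcb = max(sums[a] / counts[a] - rad[a] for a in active)
        active = [a for a in active
                  if sums[a] / counts[a] + rad[a] >= best_lcb]
    return active
```

With well-separated deterministic rewards, suboptimal arms are eliminated after the first batch and only the best arm survives.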
Ltri-LLM: Streaming Long Context Inference for LLMs with Training-Free Dynamic Triangular Attention Pattern
Tang, Hongyin, Xiu, Di, Wang, Lanrui, Geng, Xiurui, Wang, Jingang, Cai, Xunliang
The quadratic computational complexity of the attention mechanism in current Large Language Models (LLMs) renders inference with long contexts prohibitively expensive. To address this challenge, various approaches aim to retain critical portions of the context to optimally approximate Full Attention (FA) through Key-Value (KV) compression or Sparse Attention (SA), enabling the processing of virtually unlimited text lengths in a streaming manner. However, these methods struggle to achieve performance levels comparable to FA, particularly in retrieval tasks. In this paper, our analysis of attention head patterns reveals that LLMs' attention distributions show strong local correlations, naturally reflecting a chunking mechanism for input context. We propose Ltri-LLM framework, which divides KVs into spans, stores them in an offline index, and retrieves the relevant KVs into memory for various queries. Experimental results on popular long text benchmarks show that Ltri-LLM can achieve performance close to FA while maintaining efficient, streaming-based inference.
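The span-and-retrieve idea can be illustrated with a toy index. In this sketch (function names and the mean-vector span summary are illustrative assumptions, not Ltri-LLM's actual scoring), cached key vectors are grouped into contiguous spans, each span is summarized by its mean key, and a query retrieves the best-scoring spans by dot product:

```python
def build_span_index(keys, span_len):
    """Group cached key vectors into contiguous spans; summarize each by its mean."""
    spans = [keys[i:i + span_len] for i in range(0, len(keys), span_len)]
    reps = [[sum(col) / len(s) for col in zip(*s)] for s in spans]
    return spans, reps

def retrieve_spans(query, reps, top_k):
    """Score span summaries against the query; return indices of the best spans."""
    scores = [sum(q * r for q, r in zip(query, rep)) for rep in reps]
    return sorted(range(len(reps)), key=lambda i: -scores[i])[:top_k]
```

Only the retrieved spans' KVs would then be loaded into memory for attention, keeping the working set bounded regardless of context length.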
Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Simsek, Berfin, Bendjeddou, Amire, Hsu, Daniel
This work focuses on the gradient flow dynamics of a neural network model that uses correlation loss to approximate a multi-index function on high-dimensional standard Gaussian data. Specifically, the multi-index function we consider is a sum of neurons $f^*(x) \!=\! \sum_{j=1}^k \! \sigma^*(v_j^T x)$ where $v_1, \dots, v_k$ are unit vectors, and $\sigma^*$ lacks the first and second Hermite polynomials in its Hermite expansion. It is known that, for the single-index case ($k\!=\!1$), overcoming the search phase requires polynomial time complexity. We first generalize this result to multi-index functions characterized by vectors in arbitrary directions. After the search phase, it is not clear whether the network neurons converge to the index vectors or get stuck at a sub-optimal solution. When the index vectors are orthogonal, we give a complete characterization of the fixed points and prove that neurons converge to the nearest index vectors. Therefore, using $n \! \asymp \! k \log k$ neurons ensures finding the full set of index vectors with gradient flow with high probability over random initialization. When $v_i^T v_j \!=\! \beta \! \geq \! 0$ for all $i \neq j$, we prove the existence of a sharp threshold $\beta_c \!=\! c/(c+k)$ at which the fixed point that computes the average of the index vectors transitions from a saddle point to a minimum. Numerical simulations show that a correlation loss and a mild overparameterization suffice to learn all of the index vectors when they are nearly orthogonal; however, the correlation loss fails when the dot product between the index vectors exceeds a certain threshold.
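To see where the polynomial search time comes from in the single-index case ($k=1$), one can use the standard Hermite calculus for Gaussian data. Writing $\sigma^* = \sum_{l \ge 3} c_l h_l$ in the normalized Hermite basis and $m = w^\top v$ for a single neuron $w$ on the sphere, the identity $\mathbb{E}[h_l(v^\top x)\, h_{l'}(w^\top x)] = \delta_{l,l'}\,(v^\top w)^l$ reduces the population correlation loss and its spherical gradient flow to (a sketch, with constants suppressed):

```latex
L(w) \;=\; -\,\mathbb{E}_{x \sim \mathcal{N}(0, I_d)}\!\big[\sigma^*(v^\top x)\,\sigma^*(w^\top x)\big]
\;=\; -\sum_{l \ge 3} c_l^2\, m^l,
\qquad
\dot m \;=\; (1 - m^2)\,\sum_{l \ge 3} l\, c_l^2\, m^{l-1}.
```

From a random initialization $m_0 \asymp d^{-1/2}$, the dominant term $m^{p-1}$ (with $p \ge 3$ the first nonzero Hermite index of $\sigma^*$) makes the escape time of the population flow scale like $m_0^{-(p-2)} = d^{(p-2)/2}$, which is the polynomially long search phase.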
Model-free Subsampling Method Based on Uniform Designs
Zhang, Mei, Zhou, Yongdao, Zhou, Zheng, Zhang, Aijun
Subsampling or subdata selection is a useful approach in large-scale statistical learning. Most existing studies focus on model-based subsampling methods which significantly depend on the model assumption. In this paper, we consider the model-free subsampling strategy for generating subdata from the original full data. In order to measure the goodness of representation of a subdata with respect to the original data, we propose a criterion, generalized empirical F-discrepancy (GEFD), and study its theoretical properties in connection with the classical generalized L2-discrepancy in the theory of uniform designs. These properties allow us to develop a kind of low-GEFD data-driven subsampling method based on the existing uniform designs. By simulation examples and a real case study, we show that the proposed subsampling method is superior to the random sampling method. Moreover, our method keeps robust under diverse model specifications while other popular subsampling methods are under-performing. In practice, such a model-free property is more appealing than the model-based subsampling methods, where the latter may have poor performance when the model is misspecified, as demonstrated in our simulation studies.
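A minimal sketch of design-based, model-free subsampling (not the paper's GEFD criterion, and the function name is my own) is: rescale a low-discrepancy uniform design to the data's coordinate-wise ranges, then select, for each design point, the nearest data row. The selected subdata then spreads evenly over the region the full data occupies, independent of any model:

```python
def uniform_design_subsample(data, design):
    """Pick, for each design point in [0, 1]^p, the nearest data row (by
    squared Euclidean distance after rescaling to the data's ranges)."""
    p = len(data[0])
    lo = [min(row[j] for row in data) for j in range(p)]
    hi = [max(row[j] for row in data) for j in range(p)]
    chosen = []
    for u in design:
        target = [lo[j] + u[j] * (hi[j] - lo[j]) for j in range(p)]
        best = min(range(len(data)),
                   key=lambda i: sum((data[i][j] - target[j]) ** 2 for j in range(p)))
        chosen.append(best)
    return chosen
```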
Estimating covariance and precision matrices along subspaces
We study the accuracy of estimating the covariance and the precision matrix of a $D$-variate sub-Gaussian distribution along a prescribed subspace or direction using the finite sample covariance with $N \geq D$ samples. Our results show that the estimation accuracy depends almost exclusively on the components of the distribution that correspond to the desired subspaces or directions. This is relevant for problems where the behavior of data along a lower-dimensional space is of specific interest, such as dimension reduction or structured regression problems. As a by-product of the analysis, we reduce the effect of the matrix condition number on the estimation of precision matrices. Two applications are presented: direction-sensitive eigenspace perturbation bounds, and estimation of the single-index model. For the latter, we propose a new estimator, derived from the analysis, with strong theoretical guarantees and superior numerical performance.
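The directional quantity at stake is just the quadratic form $u^\top \Sigma u$, the variance of the distribution along a unit direction $u$; its natural estimator projects the samples onto $u$ first and takes the sample variance of the resulting scalars. A minimal sketch (name and signature are illustrative):

```python
def directional_variance(samples, u):
    """Unbiased estimate of u^T Sigma u: sample variance of the projections
    of the data onto the unit direction u."""
    n = len(samples)
    proj = [sum(ui * xi for ui, xi in zip(u, x)) for x in samples]
    mean = sum(proj) / n
    return sum((p - mean) ** 2 for p in proj) / (n - 1)
```

The point of the paper's analysis is that the error of such directional estimates is governed by the distribution's behavior along $u$, not by the full $D$-dimensional spectrum.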
Multi-owner Secure Encrypted Search Using Searching Adversarial Networks
Chen, Kai, Lin, Zhongrui, Wan, Jian, Xu, Lei, Xu, Chungen
Searchable symmetric encryption (SSE) for the multi-owner model draws much attention as it enables data users to perform searches over encrypted cloud data outsourced by data owners. However, implementing secure and precise query, efficient search, and flexible dynamic system maintenance at the same time in SSE remains a challenge. To address this, this paper proposes secure and efficient multi-keyword ranked search over encrypted cloud data for the multi-owner model based on searching adversarial networks. We exploit searching adversarial networks to achieve optimal pseudo-keyword padding and obtain the optimal game equilibrium for query precision and privacy protection strength. A maximum likelihood search balanced tree is generated by probabilistic learning, which achieves efficient search and brings the computational complexity close to O(log N). In addition, we enable flexible dynamic system maintenance with a balanced index forest that makes full use of distributed computing. Compared with previous works, our solution maintains query precision above 95% while ensuring adequate privacy protection, and introduces low overhead on computation, communication, and storage.
Nonlinear generalization of the single index model
Kereta, Zeljko, Klock, Timo, Naumova, Valeriya
The single index model is a powerful yet simple model, widely used in statistics, machine learning, and other scientific fields. It models the regression function as $g(\langle a, x\rangle)$, where $a$ is an unknown index vector and $x$ are the features. This paper deals with a nonlinear generalization of this framework that allows for a regressor using multiple index vectors, adapting to local changes in the responses. To do so, we exploit the conditional distribution over function-driven partitions and use linear regression to locally estimate index vectors. We then regress by applying a kNN-type estimator that uses a localized proxy of the geodesic metric. We present theoretical guarantees for the estimation of local index vectors and out-of-sample prediction, and demonstrate the performance of our method with experiments on synthetic and real-world data sets, comparing it with state-of-the-art methods.
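A classical starting point for estimating an index vector by linear regression is Brillinger's observation that, for (near-)Gaussian features, the least-squares slope in $y = g(\langle a, x\rangle) + \varepsilon$ is proportional to $a$. The dependency-free sketch below (the paper's per-partition localization and geodesic-proxy regression are omitted; the helper name is my own) solves the normal equations and normalizes the result:

```python
def estimate_index_ols(X, y):
    """Estimate the index direction of y = g(a^T x) as the normalized
    least-squares coefficient vector (Brillinger-style, global version)."""
    p = len(X[0])
    # normal equations: (X^T X) b = X^T y
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    # Gaussian elimination without pivoting (adequate for this small sketch)
    for i in range(p):
        for j in range(i + 1, p):
            f = A[j][i] / A[i][i]
            A[j] = [aj - f * ai for aj, ai in zip(A[j], A[i])]
            b[j] -= f * b[i]
    coef = [0.0] * p
    for i in reversed(range(p)):
        coef[i] = (b[i] - sum(A[i][j] * coef[j] for j in range(i + 1, p))) / A[i][i]
    norm = sum(c * c for c in coef) ** 0.5
    return [c / norm for c in coef]
```

Applying this estimator separately on each element of a function-driven partition is, roughly, how one obtains local index vectors.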
Random Indexing K-tree
De Vries, Christopher M., De Vine, Lance, Geva, Shlomo
The purpose of this paper is to present and analyse the combination of Random Indexing (RI) with the K-tree algorithm. Both RI and K-tree adapt to changing data and decrease the cost of computationally intensive vector based applications. This combination is particularly suitable to the representation and clustering of very large document collections. Documents are typically represented in vector space as very sparse high dimensional vectors. RI can reduce the dimensionality and sparsity of this representation. In turn, the condensed representation is highly effective when working with K-tree. The paper is focused on determining the effectiveness of using RI with K-tree through experiments and comparative analysis of results. Sections 2 to 6 discuss K-tree, Random Indexing, Document Representation, Experimental Setup and Experimental results respectively. The paper ends with a conclusion in Section 7.
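The Random Indexing construction itself is simple: each term is assigned a sparse ternary index vector (a few +1s and −1s in random positions), and a document vector is the sum of the index vectors of its tokens. A minimal sketch (dimension and sparsity parameters are illustrative, and term weighting is omitted):

```python
import random

def random_index_vectors(vocab, dim, nnz, seed=0):
    """Assign each term a sparse ternary index vector with nnz nonzero
    entries, half +1 and half -1, in random positions."""
    rng = random.Random(seed)
    vecs = {}
    for term in vocab:
        v = [0] * dim
        for k, j in enumerate(rng.sample(range(dim), nnz)):
            v[j] = 1 if k < nnz // 2 else -1
        vecs[term] = v
    return vecs

def document_vector(tokens, vecs):
    """Represent a document as the sum of its tokens' index vectors."""
    dim = len(next(iter(vecs.values())))
    doc = [0] * dim
    for t in tokens:
        for j, x in enumerate(vecs[t]):
            doc[j] += x
    return doc
```

Because the sparse random vectors are nearly orthogonal in high dimensions, these low-dimensional sums approximately preserve similarities between the original sparse term-count vectors, which is what makes them suitable input for K-tree.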